Amendments to the Claims:

This listing of claims will replace all prior versions, and listings, of claims in 
the application: 

Please amend the claims as follows: 

1. (presently amended) A method for performing a gather operation on a
general purpose computer processor comprising: 

computing addresses for a plurality of data elements of a matrix stored in 
memory, wherein: 

each data element is identified by one of an equal plurality of 
indices and a base address; and 

computing addresses comprises: 

executing an equal plurality of EXTRACT instructions to 
transfer a plurality of said indices from a first storage location 
where the indices are stored substantially contiguously, to an equal 
plurality of separate storage locations, wherein each index is 
assigned its own separate storage location; and 

adding said base address to each index, wherein each
addition of said base address to each index is independent of one
another;

retrieving each of said plurality of data elements from memory based on 
the computed addresses; and 

executing an equal plurality of DEPOSIT instructions, each DEPOSIT 
instruction depositing one or more of said data elements contiguously with other 
data elements in a general purpose register. 
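For illustration only, and forming no part of the claims, the gather steps recited in claim 1 can be sketched in C. This is a minimal sketch under stated assumptions: the 64-bit register and 16-bit element widths are borrowed from claim 7, four indices are packed into one register, and the helpers `extract16` and `deposit16` are hypothetical stand-ins for the EXTRACT and DEPOSIT instructions, which the claims do not define at the bit level.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for an EXTRACT instruction: returns the 16-bit
 * field in position `slot` (0 = least significant) of a 64-bit register. */
static uint64_t extract16(uint64_t reg, unsigned slot) {
    assert(slot < 4);
    return (reg >> (16u * slot)) & 0xFFFFu;
}

/* Hypothetical stand-in for a DEPOSIT instruction: writes `value` into the
 * 16-bit field in position `slot`, leaving the other fields unchanged. */
static uint64_t deposit16(uint64_t reg, unsigned slot, uint16_t value) {
    assert(slot < 4);
    uint64_t mask = (uint64_t)0xFFFFu << (16u * slot);
    return (reg & ~mask) | ((uint64_t)value << (16u * slot));
}

/* Gathers four 16-bit elements of the matrix at `base` into one register.
 * `packed_indices` holds the four indices contiguously (assumed layout). */
uint64_t gather4(const uint16_t *base, uint64_t packed_indices) {
    uint64_t idx[4];          /* one separate storage location per index */
    const uint16_t *addr[4];  /* computed addresses */
    uint16_t elem[4];         /* retrieved data elements */
    uint64_t result = 0;
    unsigned slot;

    /* One EXTRACT per index, each into its own location. */
    for (slot = 0; slot < 4; slot++)
        idx[slot] = extract16(packed_indices, slot);

    /* Add the base address to each index. No addition depends on another.
     * C pointer arithmetic scales the index by the element size. */
    for (slot = 0; slot < 4; slot++)
        addr[slot] = base + idx[slot];

    /* Retrieve each data element from memory at its computed address. */
    for (slot = 0; slot < 4; slot++)
        elem[slot] = *addr[slot];

    /* One DEPOSIT per element, packing them contiguously. */
    for (slot = 0; slot < 4; slot++)
        result = deposit16(result, slot, elem[slot]);

    return result;
}
```

Because the four additions in the second loop share no operands other than the base address, a processor able to issue several instructions per cycle could, in principle, perform them in parallel.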

2. (previously presented) The method as in claim 1 wherein said storage
locations are general purpose registers within a general purpose processor. 

3. (cancelled) 

4. (previously presented) The method as in claim 1 further comprising: 
loading each of said data elements from memory into separate storage
locations prior to executing said second plurality of instructions.

5. (previously presented) The method as in claim 1 wherein said 
computer processor executes two or more of said first and/or second plurality of 
instructions in a single clock cycle. 

6. (original) The method as in claim 1 further comprising: 
storing each of said data elements on a mass storage device. 

7. (original) The method as in claim 2 wherein said registers are 64-bits 
wide and said data elements are 16-bits in length. 

8. (presently amended) A method for performing a scatter operation on a 
general purpose computer processor comprising: 

executing a first plurality of EXTRACT instructions to extract indices for 
each of a plurality of data elements, the indices being extracted into separate 
storage locations; 

using the extracted indices to calculate addresses in memory to which 
said plurality of data elements are to be scattered to form a matrix in memory 
wherein each address in memory is identified by one of a plurality of indices and
a base address, and further wherein each address in memory is calculated by
adding said base address to each index of said plurality of indices, wherein each
addition of said base address to each index is independent of one another;

executing a second plurality of EXTRACT instructions, each of said 
EXTRACT instructions extracting one or more of said data elements from a 
general purpose register in which said data elements are stored contiguously to 
an equal plurality of separate storage locations; and 

transferring said data elements from said separate storage locations to 
said calculated addresses in memory. 
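For illustration only, and forming no part of the claims, the scatter steps recited in claim 8 can be sketched in C. This is a minimal sketch under stated assumptions: the 64-bit register and 16-bit element widths are borrowed from claim 13, four indices and four data elements are each packed into one register, and the helper `extract16` is a hypothetical stand-in for the EXTRACT instruction, which the claims do not define at the bit level.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for an EXTRACT instruction: returns the 16-bit
 * field in position `slot` (0 = least significant) of a 64-bit register. */
static uint64_t extract16(uint64_t reg, unsigned slot) {
    assert(slot < 4);
    return (reg >> (16u * slot)) & 0xFFFFu;
}

/* Scatters the four 16-bit elements packed in `packed_data` to the matrix
 * at `base`, at the positions named by the four packed indices. */
void scatter4(uint16_t *base, uint64_t packed_indices, uint64_t packed_data) {
    uint64_t idx[4];   /* one separate storage location per index */
    uint16_t *addr[4]; /* calculated destination addresses */
    uint16_t elem[4];  /* one separate storage location per element */
    unsigned slot;

    /* First plurality of EXTRACT instructions: one per index. */
    for (slot = 0; slot < 4; slot++)
        idx[slot] = extract16(packed_indices, slot);

    /* Add the base address to each index. No addition depends on another.
     * C pointer arithmetic scales the index by the element size. */
    for (slot = 0; slot < 4; slot++)
        addr[slot] = base + idx[slot];

    /* Second plurality of EXTRACT instructions: one per data element. */
    for (slot = 0; slot < 4; slot++)
        elem[slot] = (uint16_t)extract16(packed_data, slot);

    /* Transfer each element to its calculated address in memory. */
    for (slot = 0; slot < 4; slot++)
        *addr[slot] = elem[slot];
}
```

The sketch assumes the indices are distinct; if two indices coincide, the element in the higher slot overwrites the other.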

9. (previously presented) The method as in claim 8 wherein each of said
storage locations is a general purpose register.

10. (cancelled) 

11. (previously presented) The method as in claim 8 wherein storing each
of said data elements is accomplished via a plurality of STORE instructions 
executed by said computer processor. 

12. (previously presented) The method as in claim 8 wherein said 
computer processor executes two or more of said instructions in a single clock 
cycle. 

13. (original) The method as in claim 9 wherein said register is 64-bits 
wide and said data elements are 16-bits in length. 

14. (presently amended) A computer system comprising:
a memory; 

a general purpose processor communicatively coupled to the memory; and

a storage device communicatively coupled to the processor and having 
stored therein a sequence of instructions which, when executed by the 
processor, causes the processor to at least, 

compute addresses for a plurality of data elements of a matrix 
stored in memory, wherein: 

each data element is identified by one of an equal plurality of 
indices and a base address; and 
computing addresses comprises: 

executing an equal plurality of EXTRACT instructions 
to transfer a plurality of said indices from a first storage 
location where the indices are stored substantially 
contiguously, to an equal plurality of separate storage 
locations, wherein each index is assigned its own separate 
storage location; and 

adding said base address to each index, wherein
each addition of said base address to each index is
independent of one another;
retrieve each of said plurality of data elements from memory based on the 
computed addresses; and 

execute an equal plurality of DEPOSIT instructions, each DEPOSIT
instruction depositing one or more of said data elements contiguously with other 
data elements in a general purpose register. 

15. (previously presented) The computer system as in claim 14 wherein 
said storage locations are general purpose registers. 

16. (cancelled) 

17. (previously presented) The computer system as in claim 14 wherein 
said processor loads each of said data elements from memory into separate 
storage locations prior to executing said second plurality of instructions. 

18. (previously presented) The computer system as in claim 17 wherein 
said processor executes two or more of said first and/or second plurality of 
instructions in a single clock cycle. 

19. (original) The computer system as in claim 14 wherein, responsive to 
one or more instructions in said sequence, said processor further: 

stores each of said data elements on said mass storage device. 

20. (original) The computer system as in claim 15 wherein said registers 
are 64-bits wide and said data elements are 16-bits in length. 

21. (previously presented) A method as in claim 1 wherein computing 
addresses comprises: 
executing a series of instructions, each instruction to extract an address
index for one of said plurality of data elements. 

22. (original) The method as in claim 21 wherein said address indices are 
extracted from a series of contiguous memory locations.

23. (cancelled) 

24. (new) The method as in claim 1 wherein the distances between a 
plurality of two neighboring indices within the plurality of said indices are of 
varying lengths. 

25. (new) The method as in claim 8 wherein the distances between a 
plurality of two neighboring indices within the plurality of said indices are of 
varying lengths. 

26. (new) The computer system as in claim 14 wherein the distances between a
plurality of two neighboring indices within the plurality of said indices are of 
varying lengths. 